Syllabification Algorithm based on Syllable Rules Matching for Malay Language

Author

  • RABIAH A. KADIR
Abstract

In this paper, we present a new syllabification algorithm for the Malay language. Syllabification is the process of dividing words into syllables. It is language dependent, since each language can have its own set of syllable structures. Syllabification is an important component of speech synthesizers, speech recognition systems and transliteration systems. Syllabification algorithms have been proposed for many languages, including English, Spanish, Myanmar, Sinhala and Chinese. Unfortunately, there is little information on the evaluation of syllabification schemes for Malay. We propose an efficient algorithm based on syllable rules matching. To evaluate the algorithm, a prototype was developed to measure syllabification accuracy. We evaluate our method using the Bernama, Kamus Dewan and Overlap data collections. Syllable rules matching achieved 60.7% accuracy on the Bernama collection, 77.4% on the Kamus Dewan collection and 71.6% on the Overlap collection.

Keywords: Syllabification, Text-to-Speech, Syllable Matching, Speech Synthesizer, Elicitation
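The abstract does not list the paper's rule set, so the following is only a minimal sketch of what syllabification by rule matching can look like, not the author's algorithm. The templates in `RULES`, the digraph list in `DIGRAPHS`, and the function names `to_units` and `syllabify` are all assumptions made for illustration. The sketch reduces a word to a consonant/vowel shape, then matches syllable templates from left to right, refusing to close a syllable with a consonant that the following vowel needs as its onset.

```python
# Illustrative sketch only: the rule set below is an assumption,
# not the rule set used in the paper.
DIGRAPHS = ("ng", "ny", "sy", "kh", "gh")  # treated as single consonants (assumed)
VOWELS = set("aeiou")
RULES = ("CVC", "CV", "VC", "V")           # hypothetical templates, longest first


def to_units(word):
    """Split a word into letters, keeping consonant digraphs together."""
    units = []
    i = 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def syllabify(word):
    """Return a list of syllables, or None if no rule matches somewhere."""
    units = to_units(word.lower())
    shape = "".join("V" if u in VOWELS else "C" for u in units)
    syllables = []
    pos = 0
    while pos < len(units):
        for rule in RULES:
            end = pos + len(rule)
            if not shape.startswith(rule, pos):
                continue
            # A final consonant must not be taken if the next unit is a
            # vowel: that consonant is the onset of the next syllable.
            if rule.endswith("C") and end < len(units) and shape[end] == "V":
                continue
            syllables.append("".join(units[pos:end]))
            pos = end
            break
        else:
            return None  # no template applies at this position
    return syllables


for w in ("makan", "bunga", "pintu", "air", "struktur"):
    print(w, syllabify(w))
```

Under these assumed rules, `makan` splits as `ma` + `kan` and `bunga` as `bu` + `nga`, while a loanword such as `struktur` returns `None` because no template starts with a consonant cluster. Words that the rule set cannot cover are one plausible reason a rule-matching approach falls short of full accuracy on real collections.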

Similar resources

A Rule Based Algorithm for Automatic Syllabification of a Word of Bodo Language ISSN 2319 - 2720

Syllabification is the task of identifying syllables in a word. Correct syllabification rules and algorithms are mainly used in text-to-speech systems to improve the naturalness of the synthesized speech. This paper presents a study of Bodo syllable structure and linguistic rules for syllabification. An algorithm has been developed for automatic syllabification of Bo...

Automatic Syllabification Rules for Bodo Language

Syllabification is the task of identifying syllables in a word or in a sentence. Most syllabification tasks are done manually. Because syllabification rules vary from language to language, it is difficult to design common syllabification rules or a single algorithm that fits all languages. On the other hand, syllabification rules are the backbone of any task related to text-to-spe...

Syllable structure in Old, Middle and Modern Persian: A contrastive analysis

The evolution of languages has always been of interest to linguists. In this paper we study the natural progress of syllable structure from Old Persian (O.P) to Middle Persian (Mi.P) and on to Modern Persian (Mo.P). For this purpose, all words containing consonant sequences are collected from specific sources for each of these languages and then analysed according to the syllab...

Syllabification rules versus data-driven methods in a language with low syllabic complexity: The case of Italian

Linguistic rules have been assumed to be the best technique for determining the syllabification of unknown words. This has recently been challenged for the English language where data-driven algorithms have been shown to outperform rule-based methods. It may be possible, however, that data-driven methods are only better for languages with complex syllable structures. In this study, three rule-b...

A Rule Based Syllabification Algorithm for Sinhala

This paper presents a study of Sinhala syllable structure and an algorithm for identifying syllables in Sinhala words. After a thorough study of the syllable structure and linguistic rules for syllabification of Sinhala words, and a survey of the relevant literature, a set of rules was identified and implemented as a simple algorithm. The algorithm was tested using 30,000 dist...

Publication date: 2011